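Feature extraction is an essential task in graph analytics. The resulting feature vectors, called graph descriptors, are used by downstream vector-space-based graph analysis models. The idea has proved fruitful in the past, with spectral-based graph descriptors providing state-of-the-art classification accuracy. However, known algorithms for computing meaningful descriptors do not scale to large graphs because (1) they require storing the entire graph in memory, and (2) the end user has no control over the algorithm's runtime. In this paper, we present streaming algorithms to approximately compute three different graph descriptors that capture the essential structure of graphs. Operating on edge streams lets us avoid storing the entire graph in memory, and controlling the sample size lets us keep the runtime of our algorithms within desired bounds. We demonstrate the efficacy of the proposed descriptors by analyzing the approximation error and classification accuracy. Our scalable algorithms compute descriptors of graphs with millions of edges within minutes. Moreover, these descriptors yield predictive accuracy comparable to state-of-the-art methods while using only 25% as much memory.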
translated by 谷歌翻译
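The first abstract relies on two ideas: reading the graph as an edge stream, and bounding the sample size to control runtime and memory. A minimal sketch of that general idea is reservoir sampling over edges, shown below. This is an illustration under stated assumptions, not the authors' algorithm: the function name `reservoir_sample_edges` and its parameters are hypothetical, and the abstract does not say which sampling scheme the paper uses.

```python
import random
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


def reservoir_sample_edges(
    stream: Iterable[Edge], sample_size: int, seed: int = 0
) -> List[Edge]:
    """Keep a uniform random sample of at most `sample_size` edges.

    The stream is consumed once and only the reservoir is held in memory,
    so memory use is bounded by `sample_size`, not by the graph size.
    """
    rng = random.Random(seed)
    reservoir: List[Edge] = []
    for seen, edge in enumerate(stream, start=1):
        if len(reservoir) < sample_size:
            # Fill phase: keep the first `sample_size` edges.
            reservoir.append(edge)
        else:
            # Keep the new edge with probability sample_size / seen,
            # evicting a uniformly chosen stored edge.
            slot = rng.randrange(seen)
            if slot < sample_size:
                reservoir[slot] = edge
    return reservoir


# Usage: a generator stands in for an edge stream that is never materialised.
edge_stream = ((i, i + 1) for i in range(10_000))
sample = reservoir_sample_edges(edge_stream, sample_size=100)
```

Any descriptor would then be estimated from `sample` rather than from the full graph, which is what keeps the cost tied to the chosen sample size.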
Neuromorphic vision, or event vision, is an advanced vision technology in which, in contrast to a visible-light camera that outputs pixels, the sensor generates neuromorphic events whenever a brightness change in the field of view (FOV) exceeds a specific threshold. This study focuses on leveraging neuromorphic event data for roadside object detection. It is a proof of concept towards building artificial intelligence (AI) based pipelines that can be used in forward perception systems for advanced vehicular applications. The focus is on building efficient state-of-the-art object detection networks with better inference results for fast-moving forward perception using an event camera. In this article, the event-simulated A2D2 dataset is manually annotated and used to train two different YOLOv5 networks (small and large variants). To further assess robustness, single-model testing and ensemble-model testing are carried out.
Although recent deep learning-based calibration methods can predict extrinsic and intrinsic camera parameters from a single image, their generalization remains limited by the number and distribution of training data samples. The huge computational and space requirements prevent convolutional neural networks (CNNs) from being implemented in resource-constrained environments. This challenge motivated us to train a CNN incrementally, learning from new data while maintaining performance on previously learned data. Our approach builds upon a CNN architecture to automatically estimate camera parameters (focal length, pitch, and roll) using different incremental learning strategies to preserve knowledge when updating the network for new data distributions. Specifically, we adapt four common incremental learning methods, namely LwF, iCaRL, LUCIR, and BiC, by modifying their loss functions for our regression problem. We evaluate on two datasets containing 299008 indoor and outdoor images. Experimental results were significant and indicated which method was better for camera calibration estimation.
The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations. For example, fine-tuning an existing policy frequently fails, as the policy can degrade rapidly early in training. In a similar vein, distillation of expert behavior can lead to poor results when given sub-optimal experts. We compare several common approaches for skill transfer on multiple domains, including changes in task and system dynamics. We identify how existing methods can fail and introduce an alternative approach to mitigate these problems. Our approach learns to sequence existing temporally-extended skills for exploration but learns the final policy directly from the raw experience. This conceptual split enables rapid adaptation and thus efficient data collection without constraining the final solution. It significantly outperforms many classical methods across a suite of evaluation tasks, and we use a broad set of ablations to highlight the importance of different components of our method.
Disentanglement of constituent factors of a sensory signal is central to perception and cognition and hence is a critical task for future artificial intelligence systems. In this paper, we present a compute engine capable of efficiently factorizing holographic perceptual representations by exploiting the computation-in-superposition capability of brain-inspired hyperdimensional computing and the intrinsic stochasticity associated with analog in-memory computing based on nanoscale memristive devices. Such an iterative in-memory factorizer is shown to solve at least five orders of magnitude larger problems that cannot be solved otherwise, while also significantly lowering the computational time and space complexity. We present a large-scale experimental demonstration of the factorizer by employing two in-memory compute chips based on phase-change memristive devices. The dominant matrix-vector multiply operations are executed at O(1) thus reducing the computational time complexity to merely the number of iterations. Moreover, we experimentally demonstrate the ability to factorize visual perceptual representations reliably and efficiently.
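Detecting offensive language is one of the major challenges facing social media. Researchers have proposed many advanced methods for this task. In this report, we try to draw on what their approaches have learned and combine it with our own ideas to improve on them. We achieved an accuracy of 74% in classifying offensive tweets. We also list upcoming challenges in abusive content detection for the social media community.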
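Objectives: To explore the capacity of deep learning algorithms to further streamline and optimize urethral plate (UP) quality appraisal using the plate objective scoring tool (POST), with the aim of increasing the objectivity and reproducibility of the appraisal in hypospadias repair. Methods: The five key POST landmarks were marked by specialists in a 691-image dataset of prepubertal boys undergoing primary hypospadias repair. This dataset was then used to develop and validate a deep learning-based landmark detection model. The proposed framework begins with glans localization and detection, where the input image is cropped using the predicted bounding box. Next, a deep convolutional neural network (CNN) architecture is used to predict the coordinates of the five POST landmarks. These predicted landmarks are then used to assess UP quality in distal hypospadias. Results: The proposed model accurately localized the glans area, with a mean average precision (mAP) of 99.5% and an overall sensitivity of 99.1%. In predicting the landmark coordinates, it achieved a normalized mean error (NME) of 0.07152, a mean squared error (MSE) of 0.001, and a failure rate of 20.2% at an NME threshold of 0.1. Conclusions: This deep learning application shows robustness and high precision in using POST to appraise UP quality. Further assessment is to be carried out using an international multicenter image-based database. External validation could benefit deep learning algorithms and lead to better assessment, decision-making, and prediction of surgical outcomes.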
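The landmark results above are reported as a normalized mean error (NME) and a failure rate at an NME threshold of 0.1. The sketch below shows how these two metrics are commonly computed. It is an assumption-laden illustration: the abstract does not state the normalization factor, so `norm` is a hypothetical parameter (for example a bounding-box size), and the function names are made up for this example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def normalized_mean_error(
    pred: Sequence[Point], truth: Sequence[Point], norm: float
) -> float:
    """Mean Euclidean landmark error of one image, divided by `norm`.

    `norm` is an assumed normalization length; the source does not
    specify which one the authors used.
    """
    distances = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(distances) / (len(distances) * norm)


def failure_rate(per_image_nme: Sequence[float], threshold: float = 0.1) -> float:
    """Fraction of images whose NME exceeds `threshold`."""
    failures = sum(1 for nme in per_image_nme if nme > threshold)
    return failures / len(per_image_nme)


# Usage with toy values (not data from the paper).
toy_nme = normalized_mean_error([(0.0, 0.0), (3.0, 4.0)],
                                [(0.0, 0.0), (0.0, 0.0)], norm=10.0)
toy_failure = failure_rate([0.05, 0.25, 0.10, 0.20])
```

Under this definition, a 20.2% failure rate at a threshold of 0.1 means that roughly one image in five has an NME above 0.1.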
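This study develops a framework for unmanned aerial systems (UASs) to monitor fall-hazard prevention systems near unprotected edges and openings in high-rise building projects. A three-step machine-learning-based framework was developed and tested to detect guardrail posts in images captured by a UAS. First, a guardrail detector was trained to localize candidate locations of the posts supporting the guardrail. Because the images used in this process were collected from an actual job site, several false detections were identified. Additional constraints were therefore introduced in the following steps to filter out false detections. Second, the research team applied a horizontal line detector to the images to detect the floors and remove detections that were not close to a floor. Finally, because guardrail posts are installed at roughly regular intervals, the spacing between them was estimated and used to find the most likely distance between two posts. The research team used various combinations of the developed approaches to monitor guardrail systems in images captured at a high-rise building project. Comparing the precision and recall metrics showed that the cascade classifier performs better when combined with floor detection and guardrail spacing estimation. The results indicate that the proposed guardrail recognition system can improve the assessment of guardrails and ease the safety engineer's task of identifying fall hazards in high-rise building projects.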
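The process of obtaining a high-resolution image from one or more low-resolution images of the same scene is of great interest for real-world image and signal processing applications. This study explores the potential use of deep learning-based image super-resolution algorithms to produce high-quality thermal imaging results for in-cabin vehicular driver monitoring systems. In this work, we propose and develop a novel multi-image super-resolution recurrent neural network to enhance the resolution and improve the quality of low-resolution thermal imaging data captured by uncooled thermal cameras. The end-to-end fully convolutional neural network is trained from scratch on newly acquired thermal data of 30 different subjects under indoor environmental conditions. The effectiveness of the thermally tuned super-resolution network is validated quantitatively as well as qualitatively on test data from 6 different subjects. The network achieves a mean peak signal-to-noise ratio of 39.24 on the validation dataset for 4x super-resolution, outperforming bicubic interpolation both quantitatively and qualitatively.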
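The quantitative comparison above is reported as peak signal-to-noise ratio (PSNR). As a reminder of what that figure measures, here is a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE), in decibels. The function name `psnr` and the default `max_value` of 255 (8-bit images) are assumptions for illustration; the abstract does not state the dynamic range of the thermal data.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]


def psnr(reference: Image, estimate: Image, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    `max_value` is the largest possible pixel value (assumed 255 here).
    """
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            squared_error += (ref_px - est_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Usage with toy 2x2 images (not data from the paper).
reference = [[100.0, 100.0], [100.0, 100.0]]
upscaled = [[110.0, 90.0], [100.0, 100.0]]
score = psnr(reference, upscaled)
```

A higher PSNR means the super-resolved image is closer to the reference. In a comparison like the one in the abstract, the same function would be applied to both the network output and the bicubic baseline.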
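This paper proposes transfer initialization with modified fully connected layers for COVID-19 diagnosis. Convolutional neural networks (CNNs) have achieved remarkable results in image classification. However, because of the complexity of image recognition applications, training a high-performing model is a very complicated and time-consuming process. Transfer learning, on the other hand, is a relatively new learning method that has been used in many fields to achieve good performance with less computation. In this study, PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset for the first time as initialization, with modified fully connected layers. The PyTorch pre-trained models used were previously trained on ImageNet. The proposed model was developed and validated in a Kaggle notebook and reached an outstanding accuracy of 99.77% without requiring a huge amount of computation time during network training. We also applied the same approach to the SIIM-FISABIO-RSNA COVID-19 Detection dataset and achieved an accuracy of 80.01%. In contrast, previous methods need a huge amount of computation time during training to reach a high-performing model. Code is available at the following link: github.com/dipuk0506/spinalnet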